Largest Common Subsequence using Dynamic Approach
The core of the Longest Common Subsequence (LCS) problem is
finding the longest sequence that appears in both of two given strings in the same order, but
not necessarily consecutively.
The **dynamic programming** approach is commonly used to
solve the LCS problem efficiently by breaking it down into smaller overlapping subproblems.
This method is widely applied in fields like bioinformatics, text comparison, and
version control systems to measure similarity between sequences.
Video Lecture
Problem
Given two sequences, the goal is to find the longest subsequence that appears in both sequences
in the same relative order.
This process is referred to as finding the longest common
subsequence using a dynamic programming approach.
Input
Two sequences (strings or arrays).
Output
The longest common subsequence between the two sequences.
Example:
- String 1: ABCD
- String 2: ACBGD
- LCS: ACD
Types of LCS Approaches
1. Recursive Approach
This approach solves the problem by trying all possible subsequences. It is straight forward but
inefficient for large inputs due to overlapping subproblems.
2. Dynamic Programming Approach
This approach utilizes a table (2D array) to store solutions to subproblems, thus avoiding
redundant computations and achieving better performance.
Steps for LCS using Dynamic Programming
In the dynamic programming approach, a 2D table is constructed, where each cell represents the
length of the LCS for the substrings ending at that cell's indices.
The solution is built in
a bottom-up manner, following these steps:
- Table Initialization: Initialize a table where one string is along the rows
and the other along the columns. The first row and column are initialized to 0.
- Table Filling: For each character pair in the two strings, fill the table
according to the following rules:
- If the characters match, take the diagonal value and add 1.
- If the characters don't match, take the maximum value from the top or left cell.
- Traceback: After filling the table, trace back from the bottom-right corner
to construct the LCS.
By using dynamic programming, the time complexity is reduced to O(m * n), where m
and n
are the lengths of the two input sequences.
Problem
Given two sequences, the goal is to find the longest subsequence that appears in both sequences
in the same relative order.
This process is referred to as finding the longest common
subsequence using a dynamic programming approach.
Input
Two sequences (strings or arrays).
Output
The longest common subsequence between the two sequences.
Example:
- String 1: ABCD
- String 2: ACBGD
- LCS: ACD
Types of LCS Approaches
1. Recursive Approach
This approach solves the problem by trying all possible subsequences. It is straight forward but inefficient for large inputs due to overlapping subproblems.
2. Dynamic Programming Approach
This approach utilizes a table (2D array) to store solutions to subproblems, thus avoiding redundant computations and achieving better performance.
Steps for LCS using Dynamic Programming
In the dynamic programming approach, a 2D table is constructed, where each cell represents the
length of the LCS for the substrings ending at that cell's indices.
The solution is built in
a bottom-up manner, following these steps:
- Table Initialization: Initialize a table where one string is along the rows and the other along the columns. The first row and column are initialized to 0.
- Table Filling: For each character pair in the two strings, fill the table
according to the following rules:
- If the characters match, take the diagonal value and add 1.
- If the characters don't match, take the maximum value from the top or left cell.
- Traceback: After filling the table, trace back from the bottom-right corner to construct the LCS.
By using dynamic programming, the time complexity is reduced to O(m * n), where m and n are the lengths of the two input sequences.
Longest Common Subsequence Algorithm
LCS(X[1..m], Y[1..n])
Input: Two sequences, X of length m and Y of length n.
Output: The length of the longest common subsequence and the LCS itself.
Step 1: Start.
Step 2: Initialize a 2D array L[m+1][n+1], where L[i][j] will store the length of the LCS of X[1..i] and Y[1..j].
Step 3: Set the first row and first column of the array to 0 (i.e., L[i][0] = 0 for all i, and L[0][j] = 0 for all j).
Step 4: For each character X[i] in X and Y[j] in Y, do the following:
Step 5: If X[i] == Y[j], then set L[i][j] = L[i-1][j-1] + 1.
Step 6: If X[i] != Y[j], then set L[i][j] = max(L[i-1][j], L[i][j-1]).
Step 7: After filling the table, L[m][n] contains the length of the LCS.
Step 8: Traceback from L[m][n] to construct the LCS string by comparing X and Y in reverse order.
Step 9: Stop.
Largest Common Subsequence using Dynamic Approach code
#include <stdio.h>
#include <string.h>
// Function to find the length of the Longest Common Subsequence
int LCS(char* X, char* Y, int m, int n) {
int L[m + 1][n + 1];
// Build L[m+1][n+1] in bottom-up fashion
for (int i = 0; i <= m; i++) {
for (int j = 0; j <= n; j++) {
if (i == 0 || j == 0) {
L[i][j] = 0;
} else if (X[i - 1] == Y[j - 1]) {
L[i][j] = L[i - 1][j - 1] + 1;
} else {
L[i][j] = (L[i - 1][j] > L[i][j - 1]) ? L[i - 1][j] : L[i][j - 1];
}
}
}
return L[m][n]; // The length of LCS is in L[m][n]
}
// Function to print the Longest Common Subsequence
void printLCS(char* X, char* Y, int m, int n) {
int L[m + 1][n + 1];
// Build L[m+1][n+1] in bottom-up fashion
for (int i = 0; i <= m; i++) {
for (int j = 0; j <= n; j++) {
if (i == 0 || j == 0) {
L[i][j] = 0;
} else if (X[i - 1] == Y[j - 1]) {
L[i][j] = L[i - 1][j - 1] + 1;
} else {
L[i][j] = (L[i - 1][j] > L[i][j - 1]) ? L[i - 1][j] : L[i][j - 1];
}
}
}
// Following code builds the LCS string
int index = L[m][n]; // Length of LCS
char lcs[index + 1]; // LCS string
lcs[index] = '\0'; // Set the terminating character
// Start from L[m][n] and work backward
int i = m, j = n;
while (i > 0 && j > 0) {
if (X[i - 1] == Y[j - 1]) {
lcs[index - 1] = X[i - 1]; // Add character to LCS
i--;
j--;
index--;
} else if (L[i - 1][j] > L[i][j - 1]) {
i--;
} else {
j--;
}
}
printf("The Longest Common Subsequence is: %s\n", lcs);
}
int main() {
char X[100], Y[100];
// Input two sequences
printf("Enter the first sequence: ");
scanf("%s", X);
printf("Enter the second sequence: ");
scanf("%s", Y);
int m = strlen(X);
int n = strlen(Y);
printf("Length of Longest Common Subsequence: %d\n", LCS(X, Y, m, n));
printLCS(X, Y, m, n);
return 0;
}
#include <iostream>
#include <cstring>
using namespace std;
// Function to find the length of the Longest Common Subsequence
int LCS(string X, string Y, int m, int n) {
int L[m + 1][n + 1];
// Build L[m+1][n+1] in bottom-up fashion
for (int i = 0; i <= m; i++) {
for (int j = 0; j <= n; j++) {
if (i == 0 || j == 0) {
L[i][j] = 0;
} else if (X[i - 1] == Y[j - 1]) {
L[i][j] = L[i - 1][j - 1] + 1;
} else {
L[i][j] = max(L[i - 1][j], L[i][j - 1]);
}
}
}
return L[m][n]; // The length of LCS is in L[m][n]
}
// Function to print the Longest Common Subsequence
void printLCS(string X, string Y, int m, int n) {
int L[m + 1][n + 1];
// Build L[m+1][n+1] in bottom-up fashion
for (int i = 0; i <= m; i++) {
for (int j = 0; j <= n; j++) {
if (i == 0 || j == 0) {
L[i][j] = 0;
} else if (X[i - 1] == Y[j - 1]) {
L[i][j] = L[i - 1][j - 1] + 1;
} else {
L[i][j] = max(L[i - 1][j], L[i][j - 1]);
}
}
}
// Following code builds the LCS string
int index = L[m][n]; // Length of LCS
string lcs(index, '\0'); // LCS string
// Start from L[m][n] and work backward
int i = m, j = n;
while (i > 0 && j > 0) {
if (X[i - 1] == Y[j - 1]) {
lcs[index - 1] = X[i - 1]; // Add character to LCS
i--;
j--;
index--;
} else if (L[i - 1][j] > L[i][j - 1]) {
i--;
} else {
j--;
}
}
cout < "The Longest Common Subsequence is: " < lcs < endl;
}
int main() {
string X, Y;
// Input two sequences
cout < "Enter the first sequence: ";
cin >> X;
cout < "Enter the second sequence: ";
cin >> Y;
int m = X.size();
int n = Y.size();
cout < "Length of Longest Common Subsequence: " < LCS(X, Y, m, n) < endl;
printLCS(X, Y, m, n);
return 0;
}
import java.util.Scanner;
public class LCS {
// Function to find the length of the Longest Common Subsequence
public static int LCS(String X, String Y, int m, int n) {
int[][] L = new int[m + 1][n + 1];
// Build L[m+1][n+1] in bottom-up fashion
for (int i = 0; i <= m; i++) {
for (int j = 0; j <= n; j++) {
if (i == 0 || j == 0) {
L[i][j] = 0;
} else if (X.charAt(i - 1) == Y.charAt(j - 1)) {
L[i][j] = L[i - 1][j - 1] + 1;
} else {
L[i][j] = Math.max(L[i - 1][j], L[i][j - 1]);
}
}
}
return L[m][n]; // The length of LCS is in L[m][n]
}
// Function to print the Longest Common Subsequence
public static void printLCS(String X, String Y, int m, int n) {
int[][] L = new int[m + 1][n + 1];
// Build L[m+1][n+1] in bottom-up fashion
for (int i = 0; i <= m; i++) {
for (int j = 0; j <= n; j++) {
if (i == 0 || j == 0) {
L[i][j] = 0;
} else if (X.charAt(i - 1) == Y.charAt(j - 1)) {
L[i][j] = L[i - 1][j - 1] + 1;
} else {
L[i][j] = Math.max(L[i - 1][j], L[i][j - 1]);
}
}
}
// Following code builds the LCS string
int index = L[m][n]; // Length of LCS
char[] lcs = new char[index]; // LCS string
// Start from L[m][n] and work backward
int i = m, j = n;
while (i > 0 && j > 0) {
if (X.charAt(i - 1) == Y.charAt(j - 1)) {
lcs[index - 1] = X.charAt(i - 1); // Add character to LCS
i--;
j--;
index--;
} else if (L[i - 1][j] > L[i][j - 1]) {
i--;
} else {
j--;
}
}
System.out.println("The Longest Common Subsequence is: " + new String(lcs));
}
public static void main(String[] args) {
Scanner sc = new Scanner(System.in);
// Input two sequences
System.out.print("Enter the first sequence: ");
String X = sc.next();
System.out.print("Enter the second sequence: ");
String Y = sc.next();
int m = X.length();
int n = Y.length();
System.out.println("Length of Longest Common Subsequence: " + LCS(X, Y, m, n));
printLCS(X, Y, m, n);
}
}
def LCS(X, Y):
m = len(X)
n = len(Y)
L = [[0] * (n + 1) for i in range(m + 1)]
# Build L[m+1][n+1] in bottom-up fashion
for i in range(m + 1):
for j in range(n + 1):
if i == 0 or j == 0:
L[i][j] = 0
elif X[i - 1] == Y[j - 1]:
L[i][j] = L[i - 1][j - 1] + 1
else:
L[i][j] = max(L[i - 1][j], L[i][j - 1])
return L[m][n] # The length of LCS is in L[m][n]
def printLCS(X, Y):
m = len(X)
n = len(Y)
L = [[0] * (n + 1) for i in range(m + 1)]
# Build L[m+1][n+1] in bottom-up fashion
for i in range(m + 1):
for j in range(n + 1):
if i == 0 or j == 0:
L[i][j] = 0
elif X[i - 1] == Y[j - 1]:
L[i][j] = L[i - 1][j - 1] + 1
else:
L[i][j] = max(L[i - 1][j], L[i][j - 1])
# Following code builds the LCS string
index = L[m][n] # Length of LCS
lcs = [''] * (index)
# Start from L[m][n] and work backward
i, j = m, n
while i > 0 and j > 0:
if X[i - 1] == Y[j - 1]:
lcs[index - 1] = X[i - 1] # Add character to LCS
i -= 1
j -= 1
index -= 1
elif L[i - 1][j] > L[i][j - 1]:
i -= 1
else:
j -= 1
print("The Longest Common Subsequence is:", ''.join(lcs))
# Input two sequences
X = input("Enter the first sequence: ")
Y = input("Enter the second sequence: ")
# Calculate and display the results
length = LCS(X, Y)[1]
print("Length of Longest Common Subsequence:", length)
printLCS(X, Y)
Time Complexity of Longest Common Subsequence
Time Complexity Basic Operation: Table Filling
Analysis
The time complexity of the Longest Common Subsequence (LCS) algorithm using dynamic programming is O(m * n), where m and n are the lengths of the two input sequences. The algorithm uses a 2D table of size (m + 1) × (n + 1) to store the lengths of the LCS for all substring pairs.
Each cell in the table is filled in constant time O(1) by comparing characters from both sequences. Since we have (m + 1) rows and (n + 1) columns, the overall time complexity of the algorithm is:
T(m, n) = O(m) * O(n) = O(m * n)